AITopics | reduction technique

Collaborating Authors

reduction technique

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Bootstrap Model Aggregation for Distributed Statistical Learning

JUN HAN, Qiang Liu

Neural Information Processing SystemsMar-23-2026, 03:04:44 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, estimator, machine learning, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.82)

Add feedback

1e8a19426224ca89e83cef47f1e7f53b-Supplemental.pdf

Neural Information Processing SystemsFeb-7-2026, 18:36:02 GMT

clique 0, linear piece, sqrt 0, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.67)

Add feedback

Reduction Techniques for Survival Analysis

Piller, Johannes, Orsini, Léa, Wiegrebe, Simon, Zobolas, John, Burk, Lukas, Langbein, Sophie Hanna, Studener, Philip, Goeswein, Markus, Bender, Andreas

arXiv.org Machine LearningAug-11-2025

In this work, we discuss what we refer to as reduction techniques for survival analysis, that is, techniques that "reduce" a survival task to a more common regression or classification task, without ignoring the specifics of survival data. Such techniques particularly facilitate machine learning-based survival analysis, as they allow for applying standard tools from machine and deep learning to many survival tasks without requiring custom learners. We provide an overview of different reduction techniques and discuss their respective strengths and weaknesses. We also provide a principled implementation of some of these reductions, such that they are directly available within standard machine learning workflows. We illustrate each reduction using dedicated examples and perform a benchmark analysis that compares their predictive performance to established machine learning methods for survival analysis.

artificial intelligence, machine learning, reduction technique, (18 more...)

arXiv.org Machine Learning

2508.05715

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Norway > Eastern Norway > Oslo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(7 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Research Report > Strength High (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.94)

Add feedback

Evaluating Ensemble and Deep Learning Models for Static Malware Detection with Dimensionality Reduction Using the EMBER Dataset

Abedin, Md Min-Ha-Zul, Mehrub, Tazqia

arXiv.org Artificial IntelligenceJul-28-2025

This study investigates the effectiveness of several machine learning algorithms for static malware detection using the EMBER dataset, which contains feature representations of Portable Executable (PE) files. We evaluate eight classification models: LightGBM, XGBoost, CatBoost, Random Forest, Extra Trees, HistGradientBoosting, k-Nearest Neighbors (KNN), and TabNet, under three preprocessing settings: original feature space, Principal Component Analysis (PCA), and Linear Discriminant Analysis (LDA). The models are assessed on accuracy, precision, recall, F1 score, and AUC to examine both predictive performance and robustness. Ensemble methods, especially LightGBM and XGBoost, show the best overall performance across all configurations, with minimal sensitivity to PCA and consistent generalization. LDA improves KNN performance but significantly reduces accuracy for boosting models. TabNet, while promising in theory, underperformed under feature reduction, likely due to architectural sensitivity to input structure. The analysis is supported by detailed exploratory data analysis (EDA), including mutual information ranking, PCA or t-SNE visualizations, and outlier detection using Isolation Forest and Local Outlier Factor (LOF), which confirm the discriminatory capacity of key features in the EMBER dataset. The results suggest that boosting models remain the most reliable choice for high-dimensional static malware detection, and that dimensionality reduction should be applied selectively based on model type. This work provides a benchmark for comparing classification models and preprocessing strategies in malware detection tasks and contributes insights that can guide future system development and real-world deployment.

artificial intelligence, machine learning, malware detection, (17 more...)

arXiv.org Artificial Intelligence

2507.16952

Country: Asia > Bangladesh (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Optimization of embeddings storage for RAG systems using quantization and dimensionality reduction techniques

Huerga-Pérez, Naamán, Álvarez, Rubén, Ferrero-Guillén, Rubén, Martínez-Gutiérrez, Alberto, Díez-González, Javier

arXiv.org Artificial IntelligenceMay-2-2025

Retrieval-Augmented Generation enhances language models by retrieving relevant information from external knowledge bases, relying on high-dimensional vector embeddings typically stored in float32 precision. However, storing these embeddings at scale presents significant memory challenges. To address this issue, we systematically investigate on MTEB benchmark two complementary optimization strategies: quantization, evaluating standard formats (float16, int8, binary) and low-bit floating-point types (float8), and dimensionality reduction, assessing methods like PCA, Kernel PCA, UMAP, Random Projections and Autoencoders. Our results show that float8 quantization achieves a 4x storage reduction with minimal performance degradation (<0.3%), significantly outperforming int8 quantization at the same compression level, being simpler to implement. PCA emerges as the most effective dimensionality reduction technique. Crucially, combining moderate PCA (e.g., retaining 50% dimensions) with float8 quantization offers an excellent trade-off, achieving 8x total compression with less performance impact than using int8 alone (which provides only 4x compression). To facilitate practical application, we propose a methodology based on visualizing the performance-storage trade-off space to identify the optimal configuration that maximizes performance within their specific memory constraints.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2505.00105

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.86)
Information Technology > Data Science > Data Mining > Big Data (0.62)

Add feedback

Forward-Cooperation-Backward (FCB) learning in a Multi-Encoding Uni-Decoding neural network architecture

Dutta, Prasun, Ghosh, Koustab, De, Rajat K.

arXiv.org Artificial IntelligenceFeb-27-2025

The most popular technique to train a neural network is backpropagation. Recently, the Forward-Forward technique has also been introduced for certain learning tasks. However, in real life, human learning does not follow any of these techniques exclusively. The way a human learns is basically a combination of forward learning, backward propagation and cooperation. Humans start learning a new concept by themselves and try to refine their understanding hierarchically during which they might come across several doubts. The most common approach to doubt solving is a discussion with peers, which can be called cooperation. Cooperation/discussion/knowledge sharing among peers is one of the most important steps of learning that humans follow. However, there might still be a few doubts even after the discussion. Then the difference between the understanding of the concept and the original literature is identified and minimized over several revisions. Inspired by this, the paper introduces Forward-Cooperation-Backward (FCB) learning in a deep neural network framework mimicking the human nature of learning a new concept. A novel deep neural network architecture, called Multi Encoding Uni Decoding neural network model, has been designed which learns using the notion of FCB. A special lateral synaptic connection has also been introduced to realize cooperation. The models have been justified in terms of their performance in dimension reduction on four popular datasets. The ability to preserve the granular properties of data in low-rank embedding has been tested to justify the quality of dimension reduction. For downstream analyses, classification has also been performed. An experimental study on convergence analysis has been performed to establish the efficacy of the FCB learning strategy.

algorithm, dataset, meud-ff-coop, (14 more...)

arXiv.org Artificial Intelligence

2502.20113

Country:

North America > United States (0.14)
Asia > India > West Bengal > Kolkata (0.04)
Africa > Mali (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MRI Patterns of the Hippocampus and Amygdala for Predicting Stages of Alzheimer's Progression: A Minimal Feature Machine Learning Framework

Patra, Aswini Kumar, Devi, Soraisham Elizabeth, Gajurel, Tejashwini

arXiv.org Artificial IntelligenceJan-10-2025

Alzheimer's disease (AD) progresses through distinct stages, from early mild cognitive impairment (EMCI) to late mild cognitive impairment (LMCI) and eventually to AD. Accurate identification of these stages, especially distinguishing LMCI from EMCI, is crucial for developing pre-dementia treatments but remains challenging due to subtle and overlapping imaging features. This study proposes a minimal-feature machine learning framework that leverages structural MRI data, focusing on the hippocampus and amygdala as regions of interest. The framework addresses the curse of dimensionality through feature selection, utilizes region-specific voxel information, and implements innovative data organization to enhance classification performance by reducing noise. The methodology integrates dimensionality reduction techniques such as PCA and t-SNE with state-of-the-art classifiers, achieving the highest accuracy of 88.46%. This framework demonstrates the potential for efficient and accurate staging of AD progression while providing valuable insights for clinical applications.

accuracy, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.05852

Country: Asia > India (0.47)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Linear Discriminant Analysis in Credit Scoring: A Transparent Hybrid Model Approach

Reza, Md Shihab, Mahmud, Monirul Islam, Abeer, Ifti Azad, Ahmed, Nova

arXiv.org Artificial IntelligenceDec-5-2024

The development of computing has made credit scoring approaches possible, with various machine learning (ML) and deep learning (DL) techniques becoming more and more valuable. While complex models yield more accurate predictions, their interpretability is often weakened, which is a concern for credit scoring that places importance on decision fairness. As features of the dataset are a crucial factor for the credit scoring system, we implement Linear Discriminant Analysis (LDA) as a feature reduction technique, which reduces the burden of the models complexity. We compared 6 different machine learning models, 1 deep learning model, and a hybrid model with and without using LDA. From the result, we have found our hybrid model, XG-DNN, outperformed other models with the highest accuracy of 99.45% and a 99% F1 score with LDA. Lastly, to interpret model decisions, we have applied 2 different explainable AI techniques named LIME (local) and Morris Sensitivity Analysis (global). Through this research, we showed how feature reduction techniques can be used without affecting the performance and explainability of the model, which can be very useful in resource-constrained settings to optimize the computational workload.

accuracy, lda, reduction technique, (14 more...)

arXiv.org Artificial Intelligence

2412.04183

Country:

North America > United States > New York (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Banking & Finance > Credit (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Explainable Malware Detection through Integrated Graph Reduction and Learning Techniques

Mohammadian, Hesamodin, Higgins, Griffin, Ansong, Samuel, Razavi-Far, Roozbeh, Ghorbani, Ali A.

arXiv.org Artificial IntelligenceDec-4-2024

Control Flow Graphs and Function Call Graphs have become pivotal in providing a detailed understanding of program execution and effectively characterizing the behavior of malware. These graph-based representations, when combined with Graph Neural Networks (GNN), have shown promise in developing high-performance malware detectors. However, challenges remain due to the large size of these graphs and the inherent opacity in the decision-making process of GNNs. This paper addresses these issues by developing several graph reduction techniques to reduce graph size and applying the state-of-the-art GNNExplainer to enhance the interpretability of GNN outputs. The analysis demonstrates that integrating our proposed graph reduction technique along with GNNExplainer in the malware detection framework significantly reduces graph size while preserving high performance, providing an effective balance between efficiency and transparency in malware detection.

artificial intelligence, graph, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2412.03634

Country:

North America > United States > California > Los Angeles County > Pasadena (0.04)
North America > Canada > New Brunswick > York County > Fredericton (0.04)
North America > Canada > New Brunswick > Fredericton (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ClusterGraph: a new tool for visualization and compression of multidimensional data

Dłotko, Paweł, Gurnari, Davide, Hallier, Mathis, Jurek-Loughrey, Anna

arXiv.org Machine LearningNov-8-2024

Understanding the global organization of complicated and high dimensional data is of primary interest for many branches of applied sciences. It is typically achieved by applying dimensionality reduction techniques mapping the considered data into lower dimensional space. This family of methods, while preserving local structures and features, often misses the global structure of the dataset. Clustering techniques are another class of methods operating on the data in the ambient space. They group together points that are similar according to a fixed similarity criteria, however unlike dimensionality reduction techniques, they do not provide information about the global organization of the data. Leveraging ideas from Topological Data Analysis, in this paper we provide an additional layer on the output of any clustering algorithm. Such data structure, ClusterGraph, provides information about the global layout of clusters, obtained from the considered clustering algorithm. Appropriate measures are provided to assess the quality and usefulness of the obtained representation. Subsequently the ClusterGraph, possibly with an appropriate structure--preserving simplification, can be visualized and used in synergy with state of the art exploratory data analysis techniques.

clustergraph, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

2411.05443

Country:

Europe > France > Hauts-de-France > Oise > Compiègne (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback